Abstract:
SYSTEM AND METHOD FOR NON-DESTRUCTIVELY NORMALIZING THE LOUDNESS OF AUDIO SIGNALS IN PORTABLE DEVICES. Many portable playback devices cannot decode and reproduce encoded audio content that has wide bandwidth and wide dynamic range with consistent intelligibility and loudness unless the encoded audio content has been prepared specifically for those devices. This problem can be overcome by including, in the encoded content, metadata that specifies a suitable dynamic range compression profile, either through absolute values or through differential values relative to another compression profile. A playback device can also adaptively apply gain and limiting to the playback audio. Implementations in encoders, transcoders and decoders are disclosed.
Publication number: BR112012019880B1
Application number: R112012019880-7
Filing date: 2011-02-03
Publication date: 2020-10-13
Inventors: Jeffrey C. Riedmiller; Harald H. Mundt; Michael Schug; Martin Wolters
Applicants: Dolby Laboratories Licensing Corporation; Dolby International Ab
Main IPC class:
Description:

CROSS-REFERENCE TO RELATED APPLICATIONS
[0001] This application claims priority to U.S. Provisional Patent Application No. 61/303,643, filed on February 11, 2010, which is incorporated herein by reference in its entirety.
TECHNICAL FIELD
[0002] The present invention generally relates to the encoding and decoding of audio signals and pertains, more specifically, to techniques that can be used to encode and decode audio signals for a wider range of playback devices and listening environments.
BACKGROUND ART
[0003] The growing popularity of mobile devices and other types of portable devices has created new opportunities and challenges for the creators and distributors of media content for playback on such devices, as well as for the designers and manufacturers of the devices. Many portable devices can play a wide range of media content types and formats, including the high-quality, wide-bandwidth, wide-dynamic-range audio content often associated with HDTV, Blu-ray or DVD. Portable devices can play this type of audio content through their own internal acoustic transducers or through external transducers such as headphones; however, they generally cannot reproduce that content with consistent intelligibility and loudness across varying content types and media formats.
DESCRIPTION OF THE INVENTION
[0004] The present invention is directed to providing improved methods for encoding and decoding audio signals for reproduction on a variety of devices, including mobile devices and other types of portable devices.
[0005] Various aspects of the present invention are presented in the embodiments.
[0006] The various features of the present invention and its preferred embodiments can be better understood with reference to the following discussion and the accompanying drawings, in which like reference numerals refer to like elements in the several figures. The contents of the following discussion and drawings are presented as examples only and should not be understood as representing limitations upon the scope of the present invention.
BRIEF DESCRIPTION OF THE DRAWINGS
[0007] Figure 1 is a schematic block diagram of a reproduction device.
[0008] Figure 2 is a schematic block diagram of an encoding device.
[0009] Figures 3 to 5 are schematic block diagrams of transcoding devices.
[00010] Figure 6 is a schematic block diagram of a device that can be used to implement various aspects of the present invention.
MODES FOR CARRYING OUT THE INVENTION
A. Introduction
[00011] The present invention is directed to the encoding and decoding of audio information for reproduction in challenging listening environments such as those encountered by users of mobile devices and other types of portable devices. Some examples of audio encoding and decoding are described in published standards such as the "Digital Audio Compression Standard (AC-3, E-AC-3)," Revision B, Document A/52B, June 14, 2005, published by the Advanced Television Systems Committee, Inc. (referred to herein as the "ATSC Standard"), ISO/IEC 13818-7, Advanced Audio Coding (AAC) (referred to herein as the "MPEG-2 AAC Standard"), and ISO/IEC 14496-3, subpart 4 (referred to herein as the "MPEG-4 Audio Standard"), published by the International Standards Organization (ISO). The encoding and decoding processes that conform to these standards are mentioned as examples only. The principles of the present invention can also be used with coding systems that conform to other standards.
[00012] The inventors have found that the features available in devices that conform to some coding standards are often insufficient for the applications and listening environments that are typical of mobile devices and other types of portable devices. When these types of devices decode the audio content of encoded input signals that conform to these standards, the audio content is often played at loudness levels significantly lower than those of audio content obtained by decoding encoded input signals that have been specially prepared for playback on these devices.
[00013] Encoded input signals that conform to the ATSC Standard (referred to in this document as "ATSC-compliant encoded signals"), for example, contain encoded audio information and metadata that describes how that information can be decoded. Some metadata parameters identify a dynamic range compression profile that specifies how the dynamic range of the audio information can be compressed when the encoded audio information is decoded. The full dynamic range of the decoded signal can be retained or can be compressed to varying degrees at the time of decoding to satisfy the demands of different applications and listening environments. Other metadata identifies a measure of the loudness of the encoded audio information, such as a dialog level or average program level in the encoded signal. This metadata can be used by a decoder to adjust the amplitudes of the decoded signal to achieve a specified loudness or reference reproduction level during playback. In some applications, one or more reference reproduction levels may be specified or assumed, while in other applications the user may be given control over the reference reproduction level. For example, the coding processes used to encode and decode ATSC-compliant encoded signals assume that dialog is to be reproduced at one of two reference reproduction levels. One level is 31 dB below the clipping level, which is the largest possible digital value or full-scale (FS) value, and is denoted in this document as -31 dBFS. The decoding mode that uses this level is sometimes referred to as "Line Mode" and is intended for use in applications and environments where wider dynamic ranges are suitable. The other level is set at -20 dBFS. The decoding mode that uses this second level is sometimes referred to as "RF Mode" and is intended for use in applications and environments, such as broadcasting by modulation of radio frequency (RF) signals, in which narrower dynamic ranges are needed to avoid overmodulation.
[00014] As another example, encoded signals that conform to the MPEG-2 AAC and MPEG-4 Audio standards include metadata that identifies an average loudness level for the encoded audio information. Processes that decode MPEG-2 AAC and MPEG-4 Audio compliant encoded signals can allow the listener to specify a desired playback level. The decoder uses the desired playback level and the average loudness metadata to adjust the amplitudes of the decoded signal so that the desired playback level is achieved.
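The level adjustment described in the two preceding paragraphs amounts to simple arithmetic in decibels. The following minimal sketch illustrates it under stated assumptions: the function names are hypothetical, the program level of -27 dB relative to full scale is an invented example value, and only the -31 dB and -20 dB reference levels come from the text. It is not taken from any of the cited standards.

```python
# Illustrative sketch only; names and the example program level are assumptions.

LINE_MODE_DBFS = -31.0  # "Line Mode" reference level stated in the text
RF_MODE_DBFS = -20.0    # "RF Mode" reference level stated in the text


def normalization_gain_db(program_level_dbfs: float, target_level_dbfs: float) -> float:
    """Gain in dB that moves the signalled program level to the target level."""
    return target_level_dbfs - program_level_dbfs


def apply_gain(samples, gain_db: float):
    """Scale linear PCM samples by a gain expressed in dB."""
    factor = 10.0 ** (gain_db / 20.0)
    return [s * factor for s in samples]


# Hypothetical dialog level carried in the metadata.
dialog_level_dbfs = -27.0
line_gain_db = normalization_gain_db(dialog_level_dbfs, LINE_MODE_DBFS)  # attenuates
rf_gain_db = normalization_gain_db(dialog_level_dbfs, RF_MODE_DBFS)      # amplifies
```

With these example values the same program is attenuated for the lower reference level and amplified for the higher one, which is why the higher level calls for more dynamic range compression to avoid clipping.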
[00015] When mobile devices and other types of portable devices decode and reproduce the audio content of ATSC-compliant, MPEG-2 AAC and MPEG-4 Audio encoded signals in accordance with these metadata parameters, the dynamic range and loudness level are often unsuitable, either because of the adverse listening environments encountered with these types of devices or because of electrical limitations arising from the lower operating voltages used in these devices.
[00016] Encoded signals that conform to other standards use similar types of metadata and may include a provision for specifying the desired playback loudness level. The same problems are often encountered with portable devices that decode these signals.
[00017] The present invention can be used to improve the listening experience for users of mobile and portable devices without requiring content that has been specifically prepared for those devices.
B. Device Overview
[00018] Figure 1 is a schematic block diagram of one type of receiver / decoder device 10 that incorporates various aspects of the present invention. Device 10 receives an encoded input signal from signal path 11, applies appropriate processes in deformatter 12 to extract encoded audio information and associated metadata from the input signal, passes the encoded audio information to decoder 14 and passes the metadata along signal path 13. The encoded audio information includes encoded subband signals that represent spectral content of auditory stimuli, and the metadata specifies values for a variety of parameters that include one or more decoding control parameters and one or more parameters that specify dynamic range compression according to a dynamic range compression profile. The term "dynamic range compression profile" refers to features such as gain factors, compression attack times and compression release times that define the operating characteristics of a dynamic range compressor.
[00019] Decoder 14 applies a decoding process to the encoded audio information to obtain decoded subband signals, which are passed to dynamic range control 16. The operation and functions of the decoding process can be adapted in response to decoding control parameters received from signal path 13. Examples of decoding control parameters that can be used to adapt the decoding process are parameters that identify the number and configuration of the audio channels represented by the encoded audio information.
[00020] Dynamic range control 16 optionally adjusts the dynamic range of the decoded audio information. This adjustment can be turned on or off and adapted in response to metadata received from signal path 13 and/or control signals provided in response to a listener's input. For example, a control signal can be provided in response to a listener operating a switch or selecting an operating option for device 10.
[00021] In implementations that conform to the ATSC Standard, the MPEG-2 AAC Standard or the MPEG-4 Audio Standard, for example, the encoded input signal includes encoded audio information arranged in a sequence of segments or frames. Each frame contains encoded subband signals that represent spectral components of an audio signal with its full dynamic range. Dynamic range control 16 may take no action, which allows the audio signal to be reproduced with the maximum amount of dynamic range, or it may modify the decoded subband signals to compress the dynamic range to varying degrees.
[00022] Synthesis filter bank 18 is applied to the decoded subband signals, which may have been adjusted by dynamic range control 16, and provides at its output a time domain audio signal, which can be a digital or an analog signal.
[00023] The gain limiter 20 is used in some implementations of the present invention to adjust the amplitude of the time domain audio signal. The output of the gain limiter 20 is passed along the path 21 for subsequent presentation by an acoustic transducer.
[00024] Figure 2 is a schematic block diagram of an encoder / transmitter device 30 that incorporates various aspects of the present invention. Device 30 receives from signal path 31 an audio input signal that represents auditory stimuli. Device 30 applies analysis filter bank 32 to the audio input signal to obtain subband signals that are either a frequency domain representation of the audio input signal or a set of bandwidth-limited signals representing the audio input signal. Metadata calculator 34 analyzes the audio input signal and/or one or more signals derived from it, such as a modified version of the audio input signal or the subband signals from analysis filter bank 32, to calculate metadata that specifies values for a variety of parameters including encoding control parameters, one or more decoding control parameters, and one or more parameters that specify dynamic range compression according to a dynamic range compression profile. Metadata calculator 34 can analyze time domain signals, frequency domain signals, or a combination of the two. The calculations performed by metadata calculator 34 can also be adapted in response to one or more metadata parameters received from path 33. Encoder 36 applies an encoding process to the output of analysis filter bank 32 to obtain encoded audio information that includes encoded subband signals, which is passed to formatter 38. The encoding process can be adapted in response to the encoding control parameters received from path 33. The encoding process can also generate other decoding control parameters along path 33 for use by the processes performed in device 10 to decode the encoded audio information. Formatter 38 assembles the encoded audio information and at least some of the metadata, including the one or more decoding control parameters and the one or more parameters that specify dynamic range compression, into an encoded output signal that has a format suitable for transmission or storage.
[00025] In implementations that conform to the ATSC Standard, the MPEG-2 AAC Standard or the MPEG-4 Audio Standard, for example, the encoded output signal includes encoded audio information arranged in a sequence of segments or frames. Each frame contains encoded subband signals that represent spectral components of an audio signal with its full dynamic range and that have amplitudes for reproduction at a reference reproduction level.
[00026] Deformatter 12, decoder 14, synthesis filter bank 18, analysis filter bank 32, encoder 36 and formatter 38 can be conventional in design and operation. Some examples include the corresponding components that conform to the published standards mentioned above. Implementations of the components specified or suggested in these standards are suitable for use with the present invention, but they are not required. No particular implementation of these components is critical.
[00027] Figures 3 to 5 are schematic block diagrams of different implementations of a transcoder device 40 that comprises some of the components of device 10 and device 30 described above. These components operate in substantially the same way as their counterparts. The device 40 shown in Figure 3 can transcode the encoded input signal received from path 11 into a modified version that conforms to the same coding standard. In this implementation, device 40 receives an encoded input signal from signal path 11, applies appropriate processes in deformatter 12 to extract first encoded audio information and associated metadata from the encoded input signal, passes the first encoded audio information to decoder 14 and formatter 38, and passes the metadata along signal path 43. The first encoded audio information includes encoded subband signals that represent spectral content of auditory stimuli, and the metadata specifies values for a variety of parameters including one or more decoding control parameters and one or more parameters that specify dynamic range compression according to a first dynamic range compression profile. Decoder 14 applies a decoding process to the first encoded audio information to obtain decoded subband signals. The operation and functions of the decoding process can be adapted in response to the one or more decoding control parameters received from signal path 43. The subband signals can be a frequency domain representation of the auditory stimuli or a set of bandwidth-limited signals that represent the auditory stimuli.
[00028] Metadata calculator 44 analyzes the decoded subband signals and/or one or more signals derived from them to calculate one or more parameter values that specify dynamic range compression according to a second dynamic range compression profile. For example, the one or more signals can be derived by applying synthesis filter bank 18 to the decoded subband signals. The calculations performed by metadata calculator 44 can be adapted in response to metadata received from path 43. Synthesis filter bank 18 can be omitted from this implementation if its output is not needed for the metadata calculation.
[00029] Another implementation of device 40 is shown in Figure 4. This implementation is similar to the one shown in Figure 3 but includes encoder 36. The inclusion of encoder 36 allows device 40 to transcode the encoded input signal received from path 11, which conforms to a first coding standard, into an encoded output signal that conforms to a second coding standard, which can be the same as or different from the first coding standard provided that the subband signals of the two coding standards are compatible. This is done in this implementation by having encoder 36 apply an encoding process to the subband signals to obtain second encoded audio information that conforms to the second coding standard. The second encoded audio information is passed to formatter 38. The encoding process can be adapted in response to metadata received from path 43. The encoding process can also generate other metadata along path 43 for use by the processes performed in device 10 to decode the encoded audio information. Formatter 38 assembles the metadata received from path 43 and the encoded audio information it receives into an encoded output signal that has a format suitable for transmission or storage.
[00030] Yet another implementation of device 40 is shown in Figure 5. This implementation includes synthesis filter bank 18, which is applied to the decoded subband signals to obtain a time domain or wideband representation of the encoded audio information. The inclusion of synthesis filter bank 18 and analysis filter bank 32 allows device 40 to transcode between essentially any choice of coding standards. The output of synthesis filter bank 18 is passed to analysis filter bank 32, which generates subband signals for encoding by encoder 36. Encoder 36 applies an encoding process to the output of analysis filter bank 32 to obtain second encoded audio information, which is passed to formatter 38. The encoding process can also generate other metadata along path 43 for use by the processes performed in device 10 to decode the encoded audio information. Metadata calculator 44 can calculate metadata parameter values from its analysis of any or all of the subband signals received from decoder 14, the output of synthesis filter bank 18, and the output of analysis filter bank 32.
[00031] Some aspects of device 10 and device 30 are described in more detail below. These descriptions apply to the corresponding features of device 40. These aspects are described in terms of features and characteristics of methods and devices that conform to the ATSC Standard mentioned above. These specific features and characteristics are discussed by way of example only. The principles underlying these implementations are directly applicable to methods and devices that conform to other standards.
C. Receiver / Decoder
[00032] The reproduction problems described above can be addressed by using one or more of the three techniques described below. The first technique uses gain limiting and can be implemented by features in device 10 alone. The second and third techniques use dynamic range compression, and their implementations require features in both device 10 and device 30.
1. Gain limiter
[00033] The first technique operates device 10 in RF Mode rather than in Line Mode, so that it decodes an ATSC-compliant encoded input signal with dynamic range control 16 providing higher levels of dynamic range compression and a higher reference reproduction level. The gain limiter 20 provides additional gain, raising the effective reference reproduction level to a value from -14 dBFS to -8 dBFS. Empirical results indicate that a reference level of -11 dBFS gives good results for many applications.
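As a rough illustration of this first technique, the sketch below computes the additional gain needed to move from the RF Mode level of -20 dB relative to full scale to the -11 dB level mentioned above, and follows it with a limiter that keeps samples at or below full scale. The tanh-based limiter curve, the 0.8 threshold and all function names are assumptions made for this example; the text states that no particular limiter is critical.

```python
import math

RF_MODE_LEVEL_DB = -20.0  # RF Mode reference level from the text
TARGET_LEVEL_DB = -11.0   # empirically preferred level from the text


def extra_gain_db(target_db: float = TARGET_LEVEL_DB,
                  decoded_db: float = RF_MODE_LEVEL_DB) -> float:
    """Additional gain needed to raise the decoded level to the target level."""
    return target_db - decoded_db


def soft_limit(x: float, threshold: float = 0.8) -> float:
    """Pass samples below the threshold unchanged and smoothly compress
    larger ones so that the magnitude never exceeds 1.0 (full scale).
    The tanh curve and threshold are illustrative choices only."""
    magnitude = abs(x)
    if magnitude <= threshold:
        return x
    headroom = 1.0 - threshold
    limited = threshold + headroom * math.tanh((magnitude - threshold) / headroom)
    return math.copysign(limited, x)


def gain_limiter(samples, gain_db: float):
    """Apply a fixed gain in dB, then the soft limiter, to linear samples."""
    factor = 10.0 ** (gain_db / 20.0)
    return [soft_limit(s * factor) for s in samples]


# Example: boost RF Mode output by the computed gain and limit the result.
boosted = gain_limiter([0.1, 0.5, -0.9, 1.0], extra_gain_db())
```

A practical limiter would normally use look-ahead and time-varying gain smoothing rather than a memoryless curve; the sketch only shows where the gain and the limiting sit relative to each other.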
[00034] The gain limiter 20 also applies a limiting operation to prevent the amplified digital signal from exceeding 0 dBFS. The operating characteristics of the limiter may affect the perceived quality of the reproduced audio, but no particular limiter is critical to the present invention. The limiter can be implemented in essentially any way that may be desired. Preferably, the limiter is designed to provide a "soft" limiting function rather than a "hard" clipping function.
2. Differential Compression Values
[00035] The second technique allows device 10 to apply one or more modified dynamic range compression (DRC) parameters in dynamic range control 16. Deformatter 12 obtains differential DRC parameter values from the encoded input signal and passes them, together with the conventional DRC parameter values, along path 13 to dynamic range control 16. Dynamic range control 16 calculates the required DRC parameter values by arithmetically combining the conventional DRC parameter values with the corresponding differential DRC parameter values. Gain limiter 20 does not need to be used in this situation.
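The arithmetic combination performed in dynamic range control 16 might look like the following sketch. It assumes that both the conventional (base) values and the differential values are gains expressed in dB, one per block, and that the combination is a simple addition; the text does not specify the representation, and the function name and example values are hypothetical.

```python
def combine_drc_db(base_gains_db, differential_gains_db):
    """Combine base DRC gains with differential values, element by element.

    Assumption for illustration: both sequences hold gains in dB, so the
    effective gain is their sum.
    """
    if len(base_gains_db) != len(differential_gains_db):
        raise ValueError("base and differential sequences must have equal length")
    return [b + d for b, d in zip(base_gains_db, differential_gains_db)]


# Hypothetical per-block values: the differentials deepen the compression
# where the higher reference level would otherwise cause clipping.
effective = combine_drc_db([0.0, -3.0, -6.0], [-1.5, -2.0, 0.0])
```

A decoder that does not recognize the differential values simply uses the base values unchanged, which is what keeps the scheme backward compatible.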
[00036] The differential DRC parameter values are provided in the encoded input signal by the encoder / transmitter device 30 that generated the encoded input signal. This is described below.
[00037] If the encoded input signal does not contain these differential DRC values, device 10 can use gain limiter 20 according to the first technique described above.
3. Distinct Compression Profile
[00038] The third technique allows device 10 to apply dynamic range compression according to a new dynamic range compression profile in dynamic range control 16. Deformatter 12 obtains one or more DRC parameter values for the new profile from the encoded input signal and passes them along path 13 to dynamic range control 16. Gain limiter 20 does not need to be used in this situation.
[00039] The DRC parameter values for the new dynamic range compression profile are provided in the encoded input signal by the encoder / transmitter device 30 that generated the encoded input signal. This is described below.
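Taken together, the three techniques suggest a simple decision in the receiver: use the new-profile values if the signal carries them, otherwise combine base and differential values if those are present, and otherwise fall back to the base values with the gain limiter. The sketch below is one possible reading; the `DrcMetadata` container, its field names, and the priority given to a distinct profile over differential values are assumptions, not something the text prescribes.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple


@dataclass
class DrcMetadata:
    """Hypothetical container for DRC values extracted by the deformatter."""
    base: Sequence[float]                           # conventional DRC gains (dB)
    differential: Optional[Sequence[float]] = None  # second technique
    new_profile: Optional[Sequence[float]] = None   # third technique


def select_drc(meta: DrcMetadata) -> Tuple[List[float], bool]:
    """Return (DRC gains to apply, whether the gain limiter is needed)."""
    if meta.new_profile is not None:
        # Third technique: a distinct profile was transmitted.
        return list(meta.new_profile), False
    if meta.differential is not None:
        # Second technique: rebuild the new profile from base + differential.
        return [b + d for b, d in zip(meta.base, meta.differential)], False
    # First technique: conventional values plus gain and limiting.
    return list(meta.base), True


gains, needs_limiter = select_drc(DrcMetadata(base=[0.0, -3.0]))
```

For a legacy signal with only base values, as in the last line, the function reports that the gain limiter should be used.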
[00040] If the encoded input signal does not contain the one or more DRC parameter values for the new DRC profile, device 10 can use gain limiter 20 according to the first technique described above.
D. Encoder / Transmitter
1. Differential Compression Values
[00041] The processes for the second technique discussed above are implemented in device 10 through the use of differential DRC parameter values that are extracted from the encoded input signal. These differential parameter values are provided by the device 30 that generated the encoded signal.
[00042] Device 30 provides a set of differential DRC parameter values that represent the difference between a set of base DRC parameter values that will be present in the encoded signal and a set of corresponding DRC parameter values for a new DRC profile that are required to prevent decoded audio signal samples from exceeding 0 dBFS at a higher reference reproduction level. No particular method for calculating DRC parameter values is critical to the present invention. Known methods for calculating parameter values that are compatible with the ATSC Standard are disclosed in "ATSC Recommended Practice: Techniques for Establishing and Maintaining Audio Loudness for Digital Television," Document A/85, November 4, 2009, published by the Advanced Television Systems Committee, Inc., especially Section 9 and Annex F, and in Robinson et al., "Dynamic Range Control via Metadata," preprint 5028, 107th AES Convention, New York, September 1999.
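On the encoder side, the differential values are the element-wise difference between the two sets of parameter values. The following sketch assumes, for illustration only, that both sets are gains in dB and that the difference is taken as new profile minus base, so that a decoder recovers the new profile by adding the differential to the base. The function name and the example values are hypothetical.

```python
def differential_drc_db(base_gains_db, new_profile_gains_db):
    """Differential DRC values: new-profile gain minus base gain, per block.

    Assumption for illustration: gains are in dB and the sign convention is
    new minus base, so that base + differential reproduces the new profile.
    """
    if len(base_gains_db) != len(new_profile_gains_db):
        raise ValueError("both profiles must cover the same number of blocks")
    return [n - b for b, n in zip(base_gains_db, new_profile_gains_db)]


base = [0.0, -3.0, -6.0]          # hypothetical base profile gains
new_profile = [-1.5, -5.0, -6.0]  # hypothetical gains for the higher level
diff = differential_drc_db(base, new_profile)

# Round trip as a decoder would perform it.
recovered = [b + d for b, d in zip(base, diff)]
```

Differential values are typically small, which can make them cheaper to transmit than a complete second profile.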
[00043] If the encoded output signal conforms to the ATSC Standard, the MPEG-2 AAC Standard or the MPEG-4 Audio Standard, the reference reproduction level is increased to a value from -14 dBFS to -8 dBFS. Empirical results indicate that a reference level of -11 dBFS gives good results for many applications.
[00044] For ATSC-compliant encoded output signals, metadata calculator 34 calculates a differential parameter value for the corresponding base parameter "compr" specified in the standard. Formatter 38 can assemble the differential parameter value into the portions of each encoded signal frame denoted "addbsi" (additional bit stream information) and/or "auxdata" (auxiliary data). If the differential parameter values are assembled into the "addbsi" or "auxdata" portions, the encoded signal will be compatible with all ATSC-compliant decoders. Decoders that do not recognize the differential parameter values can still process and decode the encoded signal frames correctly by skipping the "addbsi" and "auxdata" portions. Refer to Document A/52B mentioned above for more details.
[00045] For encoded output signals that conform to the MPEG-2 AAC or MPEG-4 Audio standards, formatter 38 can assemble the differential parameter values into the portions of each encoded signal frame denoted "Fill_Element" or "Data_Stream_Element" in both standards. If the differential parameter values are assembled into either of these portions, the encoded signal will be compatible with all MPEG-2 AAC and MPEG-4 Audio compliant decoders. Refer to the ISO/IEC 13818-7 and ISO/IEC 14496-3 documents mentioned above for more details.
[00046] Differential parameter values can be calculated and inserted into the encoded signal at a rate that is greater than, equal to, or less than the rate at which the corresponding base parameter values appear in the encoded signal. The rate for the differential values may vary. Flags or bits that indicate whether a previous differential value should be reused can also be included in the encoded signal.
2. Distinct Compression Profile
[00047] The processes for the third technique discussed above are implemented in device 10 through the use of DRC parameter values for the new dynamic range compression profile that are extracted from the encoded input signal. These parameter values are provided by the device 30 that generated the encoded signal.
[00048] Device 30 derives DRC parameter values for a new DRC profile by calculating the parameter values needed to prevent decoded audio signal samples from exceeding 0 dBFS at a higher reference reproduction level.
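One simple way to obtain such a value for a block of samples is to find the block's peak, add the gain that level normalization will apply at the higher reference level, and attenuate by whatever amount exceeds full scale. The sketch below does exactly that and nothing more: it omits the attack and release smoothing that a practical compressor would use, and the function names, the default reference level of -11 dB relative to full scale and the example dialog level are assumptions made for illustration.

```python
import math


def peak_dbfs(samples) -> float:
    """Peak magnitude of a block in dB relative to full scale (1.0)."""
    peak = max((abs(s) for s in samples), default=0.0)
    return -math.inf if peak == 0.0 else 20.0 * math.log10(peak)


def drc_gain_db(block, dialog_level_dbfs: float,
                reference_level_dbfs: float = -11.0) -> float:
    """Largest gain not above 0 dB that keeps the block's peak at or below
    full scale after normalization to the reference level."""
    boost_db = reference_level_dbfs - dialog_level_dbfs  # normalization gain
    peak_after_boost = peak_dbfs(block) + boost_db
    return min(0.0, -peak_after_boost)


# A block peaking at half scale with a hypothetical dialog level of -27 dB
# would clip after the 16 dB boost, so the DRC value must attenuate it.
loud_gain = drc_gain_db([0.5, -0.25], dialog_level_dbfs=-27.0)
quiet_gain = drc_gain_db([0.01, -0.005], dialog_level_dbfs=-27.0)
```

In this example the loud block receives a negative gain that exactly offsets the overshoot, while the quiet block needs no attenuation.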
[00049] If the encoded output signal conforms to the ATSC Standard, the MPEG-2 AAC Standard or the MPEG-4 Audio Standard, metadata calculator 34 calculates a DRC compression value on the assumption that the reference reproduction level is increased to a value from -14 dBFS to -8 dBFS. Empirical results indicate that a reference level of -11 dBFS gives good results for many applications. Formatter 38 can assemble the parameter value for the DRC profile into portions of each encoded signal frame as described above for the differential parameters. The use of these portions of the frames allows the encoded signal to be compatible with all decoders that conform to the respective standard.
E. Implementation
[00050] Devices that incorporate various aspects of the present invention can be implemented in a variety of ways, including software for execution by a computer or by some other device that includes more specialized components such as digital signal processor (DSP) circuitry coupled to components similar to those found in a general purpose computer. Figure 6 is a schematic block diagram of a device 70 that can be used to implement aspects of the present invention. Processor 72 provides computing resources. RAM 73 is system random access memory (RAM) used by processor 72 for processing. ROM 74 represents some form of persistent storage, such as read-only memory (ROM), for storing programs needed to operate device 70 and possibly to carry out various aspects of the present invention. I/O control 75 represents interface circuitry for receiving and transmitting signals through communication channels 76, 77. In the embodiment shown, all major system components are connected to bus 71, which may represent more than one physical or logical bus; however, a bus architecture is not required to implement the present invention.
[00051] In embodiments implemented by a general purpose computer system, additional components can be included for interfacing to devices such as a keyboard or mouse and a display, and for controlling a storage device 78 having a storage medium such as magnetic tape or disk, or an optical medium. The storage medium can be used to record programs of instructions for operating systems, utilities and applications, and can include programs that implement various aspects of the present invention.
[00052] The functions required to practice various aspects of the present invention can be performed by components that are implemented in a wide variety of ways, including discrete logic components, integrated circuits, one or more ASICs and/or program-controlled processors. The manner in which these components are implemented is not important to the present invention.
[00053] Software implementations of the present invention can be conveyed by a variety of machine-readable media, such as baseband or modulated communication paths throughout the spectrum from supersonic to ultraviolet frequencies, or storage media that convey information using essentially any recording technology, including magnetic tape, cards or disks, optical cards or discs, and detectable markings on media including paper.
Claims:
Claims (14)
[0001]
1. Method for decoding an encoded input signal to generate an audio output signal, comprising the steps of: receiving the encoded input signal, which includes encoded audio information and associated metadata, where the metadata includes one or more decoding control parameters and one or more first parameters that specify dynamic extension compression according to a first dynamic extension compression profile, and optionally includes one or more second parameters that specify dynamic extension compression according to a second dynamic extension compression profile, where the one or more first parameters have values that are set according to an encoding process that generated the encoded audio information to represent the auditory stimuli with amplitudes that do not exceed a clipping level for reproduction at a first reference reproduction level, and where the one or more second parameters have values that are set according to the encoding process that generated the encoded audio information to represent the auditory stimuli with amplitudes that do not exceed the clipping level for reproduction at a second reference reproduction level that is higher than the first reference reproduction level; applying a decoding process to the encoded audio information to obtain subband signals that represent spectral content of the auditory stimuli, where the decoding process is adapted in response to the one or more decoding control parameters; characterized by the fact that it further comprises: modifying the subband signals to obtain modified subband signals with altered dynamic range characteristics, where the modification is adapted in response to the one or more second parameters if the metadata includes the one or more second parameters, or is adapted in response to the one or more first parameters if the metadata does not include the one or more second parameters; applying a synthesis filter bank to the modified subband signals to obtain a time domain audio signal; and, if the metadata does not include the one or more second parameters, applying a gain and a limiter to the time domain audio signal in response to the metadata, where the application of the gain modifies the time domain audio signal to obtain the audio output signal with amplitudes for reproduction at the second reference reproduction level, and where the application of the limiter prevents the amplitudes of the audio output signal from exceeding the clipping level.
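As an illustration only, the branching described in claim 1 might be sketched as follows. Every name here (`apply_drc`, `synthesize`, `limit`, `decode`, `makeup_gain`, `CLIP_LEVEL`) is hypothetical, the synthesis filter bank is replaced by a plain sum of subbands, and the limiter is a simple hard clip; a real decoder would use the codec's own filter bank and a smoother limiter.

```python
from typing import List, Optional

CLIP_LEVEL = 1.0  # assumed full-scale amplitude (hypothetical constant)


def apply_drc(subbands: List[List[float]], gains: List[float]) -> List[List[float]]:
    """Scale each subband signal by its linear compression gain."""
    return [[s * g for s in band] for band, g in zip(subbands, gains)]


def synthesize(subbands: List[List[float]]) -> List[float]:
    """Stand-in for a synthesis filter bank: sum the subbands sample by sample."""
    return [sum(samples) for samples in zip(*subbands)]


def limit(signal: List[float], ceiling: float = CLIP_LEVEL) -> List[float]:
    """Hard limiter used as a placeholder for a real limiter."""
    return [max(-ceiling, min(ceiling, x)) for x in signal]


def decode(subbands: List[List[float]],
           first_gains: List[float],
           second_gains: Optional[List[float]] = None,
           makeup_gain: float = 1.0) -> List[float]:
    """Use the second profile when present; otherwise fall back to the
    first profile and then apply gain plus limiting in the time domain."""
    gains = second_gains if second_gains is not None else first_gains
    time_signal = synthesize(apply_drc(subbands, gains))
    if second_gains is None:
        time_signal = limit([x * makeup_gain for x in time_signal])
    return time_signal
```

In this sketch the gain and limiter run only on the fallback path, mirroring the condition "if the metadata does not include the one or more second parameters".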
[0002]
2. Method, according to claim 1, characterized by the fact that the one or more second parameters represent differences between corresponding parameters for the first dynamic extension compression profile and the second dynamic extension compression profile.
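A minimal sketch of how a decoder could recover the second profile from differential values, assuming the parameters are gains in dB and that the difference is defined as second minus first. The claim fixes neither the unit nor the sign convention, so both are assumptions, as is the function name `reconstruct_second_profile`.

```python
from typing import List


def reconstruct_second_profile(first_params_db: List[float],
                               deltas_db: List[float]) -> List[float]:
    """Rebuild the second profile's parameters from the first profile's
    parameters and per-parameter differences (assumed: second - first)."""
    if len(first_params_db) != len(deltas_db):
        raise ValueError("each delta needs a corresponding first parameter")
    return [f + d for f, d in zip(first_params_db, deltas_db)]
```

For example, first-profile gains of -6.0, -3.0 and 0.0 dB with differences of -2.0, -1.5 and 0.0 dB yield second-profile gains of -8.0, -4.5 and 0.0 dB.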
[0003]
3. Method according to claim 1 or 2, characterized by the fact that the encoded input signal conforms to the ATSC Standard, the MPEG-2 AAC Standard or the MPEG-4 Audio Standard, the first reference reproduction level corresponds to an amplitude of 20 dB below the clipping level, and the second reference reproduction level corresponds to an amplitude of 11 dB below the clipping level.
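The arithmetic behind these two levels can be worked through in a short sketch. Moving from 20 dB below clipping to 11 dB below clipping is a 9 dB level shift, which is why peaks that were safe at the first level may need limiting at the second. The constant and function names are hypothetical, and levels are expressed in dB relative to the clipping level.

```python
FIRST_REF_DB = -20.0   # first reference reproduction level, dB re clipping
SECOND_REF_DB = -11.0  # second reference reproduction level, dB re clipping


def db_to_linear(db: float) -> float:
    """Convert an amplitude ratio in dB to a linear factor."""
    return 10.0 ** (db / 20.0)


def level_shift_db(source_db: float, target_db: float) -> float:
    """Gain in dB needed to move playback from one reference level to another."""
    return target_db - source_db


def exceeds_clipping(peak_db: float, gain_db: float) -> bool:
    """True if a peak (dB re clipping) would pass the clipping level after gain."""
    return peak_db + gain_db > 0.0


gain_db = level_shift_db(FIRST_REF_DB, SECOND_REF_DB)  # 9 dB
gain_linear = db_to_linear(gain_db)                    # roughly 2.82
```

Under these assumptions, a peak 5 dB below clipping at the first level would land 4 dB above clipping after the shift, while a peak 12 dB below clipping would remain 3 dB below it.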
[0004]
4. Method for encoding an audio input signal that represents auditory stimuli, comprising the steps of: receiving the audio input signal; applying an analysis filter bank to the audio input signal to generate subband signals that represent spectral content of the audio input signal; analyzing one or more signals derived from the audio input signal to calculate metadata that includes one or more first parameters that specify dynamic extension compression according to a first dynamic extension compression profile and one or more second parameters that specify dynamic extension compression according to a second dynamic extension compression profile, where the one or more first parameters have values that are set to represent the auditory stimuli with amplitudes that do not exceed a clipping level for reproduction at a first reference reproduction level, and where the one or more second parameters have values that are set to represent the auditory stimuli with amplitudes that do not exceed the clipping level for reproduction at a second reference reproduction level; applying an encoding process to the subband signals to obtain encoded audio information; characterized by the fact that it further comprises: assembling the encoded audio information and the metadata into an encoded output signal that has a format suitable for transmission or storage, where the one or more second parameters represent differences between corresponding parameters for the first dynamic extension compression profile and the second dynamic extension compression profile.
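One way to picture the metadata calculation in claim 4 is the simplified sketch below. It assumes each block of audio is summarized by its peak in dB relative to the program's reference loudness, and that a profile's parameter is just the attenuation needed to keep that peak at or below clipping. Real compression profiles use compressor curves and time smoothing, so this only illustrates why the second profile needs more attenuation and how the differences arise. All names are hypothetical.

```python
from typing import Dict, List


def protective_gain_db(peak_rel_db: float, reference_level_db: float) -> float:
    """Attenuation (<= 0 dB) that keeps a block peak at or below clipping
    when the program is reproduced at the given reference level."""
    peak_at_playback_db = reference_level_db + peak_rel_db
    return min(0.0, -peak_at_playback_db)


def build_metadata(block_peaks_rel_db: List[float],
                   first_ref_db: float = -20.0,
                   second_ref_db: float = -11.0) -> Dict[str, List[float]]:
    """Compute first-profile gains and second-profile gains stored as
    differences from the first profile (assumed: second - first)."""
    first = [protective_gain_db(p, first_ref_db) for p in block_peaks_rel_db]
    second = [protective_gain_db(p, second_ref_db) for p in block_peaks_rel_db]
    deltas = [s - f for f, s in zip(first, second)]
    return {"first_params_db": first, "second_param_deltas_db": deltas}
```

With block peaks of 15, 8 and 22 dB above the reference loudness, the first profile needs 0, 0 and -2 dB, the second needs -4, 0 and -11 dB, and the stored differences are -4, 0 and -9 dB.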
[0005]
5. Method, according to claim 4, characterized by the fact that the encoded output signal conforms to the ATSC Standard, the MPEG-2 AAC Standard or the MPEG-4 Audio Standard, the first reference reproduction level corresponds to an amplitude of 20 dB below the clipping level, and the second reference reproduction level corresponds to an amplitude of 11 dB below the clipping level.
[0006]
6. Method for transcoding an encoded input signal to generate an encoded output signal, characterized by the fact that it comprises the steps of: receiving the encoded input signal, which includes first encoded audio information and associated metadata that includes one or more decoding control parameters and one or more first parameters that specify dynamic extension compression according to a first dynamic extension compression profile, where the one or more first parameters have values that are set according to a first encoding process that generated the first encoded audio information to represent auditory stimuli with amplitudes that do not exceed a clipping level for reproduction at a first reference reproduction level; applying a decoding process to the first encoded audio information to obtain subband signals that represent the spectral content of the auditory stimuli, where the decoding process is adapted in response to the one or more decoding control parameters; analyzing one or more signals obtained from the subband signals to calculate one or more second parameters that specify dynamic extension compression according to a second dynamic extension compression profile, where the one or more second parameters have values that are set to represent the auditory stimuli with amplitudes that do not exceed the clipping level for reproduction at a second reference reproduction level; and assembling second encoded audio information, the one or more first parameters and the one or more second parameters into an encoded output signal that has a format suitable for transmission or storage, where the second encoded audio information is an encoded representation of the subband signals.
[0007]
7. Method according to claim 6, characterized by the fact that the one or more second parameters represent differences between corresponding parameters for the first dynamic extension compression profile and the second dynamic extension compression profile.
[0008]
8. Method according to claim 6 or 7, characterized in that it comprises applying a synthesis filter bank to the subband signals to obtain the one or more signals that are analyzed to calculate the one or more second parameters that specify dynamic extension compression.
[0009]
9. Method according to any one of claims 6 to 8, characterized in that it comprises applying a second encoding process to the subband signals to generate the second encoded audio information.
[0010]
10. Method according to any one of claims 6 to 8, characterized in that the second encoded audio information is the first encoded audio information.
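The transcoding flow of claims 6 to 10 can be sketched as below, under stated assumptions: the container `EncodedSignal`, the callables `decode_to_subbands`, `analyze_second_profile` and `reencode`, and the function `transcode` are all hypothetical stand-ins. The optional `reencode` argument distinguishes the case where the subband signals are encoded again (claim 9) from the case where the original encoded audio is passed through unchanged (claim 10).

```python
from dataclasses import dataclass
from typing import Callable, List, Optional

Subbands = List[List[float]]


@dataclass
class EncodedSignal:
    audio: bytes                                   # encoded audio information
    first_params_db: List[float]                   # first compression profile
    second_params_db: Optional[List[float]] = None # second profile, if present


def transcode(signal: EncodedSignal,
              decode_to_subbands: Callable[[bytes], Subbands],
              analyze_second_profile: Callable[[Subbands], List[float]],
              reencode: Optional[Callable[[Subbands], bytes]] = None) -> EncodedSignal:
    """Keep the first-profile parameters, derive second-profile parameters
    from the decoded subbands, and assemble a new encoded signal."""
    subbands = decode_to_subbands(signal.audio)
    second = analyze_second_profile(subbands)
    # Re-encode the subbands if an encoder is supplied; otherwise reuse
    # the original encoded audio information as-is.
    audio = reencode(subbands) if reencode is not None else signal.audio
    return EncodedSignal(audio=audio,
                         first_params_db=list(signal.first_params_db),
                         second_params_db=second)
```

Either way, the output carries both parameter sets, so a downstream decoder can pick the profile that suits its reference reproduction level.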
[0011]
11. Method according to any of claims 6 to 10, characterized in that the encoded input signal conforms to the ATSC Standard, the MPEG-2 AAC Standard or the MPEG-4 Audio Standard, and the first reference reproduction level corresponds to an amplitude of 20 dB below the clipping level.
[0012]
12. Method according to any of claims 6 to 10, characterized in that the encoded output signal conforms to the ATSC Standard, the MPEG-2 AAC Standard or the MPEG-4 Audio Standard, and the second reference reproduction level corresponds to an amplitude of 11 dB below the clipping level.
[0013]
13. Apparatus characterized by the fact that it comprises means for performing the steps of the method as defined in any of claims 1 to 12.
[0014]
14. Storage medium in a device, characterized by the fact that it is for performing the steps of the method as defined in any of claims 1 to 12.
Similar technologies:
Publication number | Publication date | Patent title
BR112012019880B1|2020-10-13|method for decoding an encoded input signal to generate an audio output signal, method for encoding an audio input signal that represents auditory stimuli, method for transcoding an encoded input signal to generate an encoded output signal, apparatus and medium of storage
JP6851523B2|2021-03-31|Loudness and dynamic range optimization across different playback devices
US20110170710A1|2011-07-14|Method and apparatus for adjusting volume
US9449082B2|2016-09-20|Systems and methods for dynamic audio processing
BR112013005958B1|2021-04-20|method for mixing two audio input signals into a single mixed audio signal, device for mixing signals, processor-readable storage medium and device for mixing audio input signals into a single mixed audio signal
RU2009116276A|2010-11-10|METHODS AND DEVICES FOR CODING AND DECODING OF AUDIO SIGNALS BASED ON OBJECTS
Patent family:
Publication number | Publication date
US8903729B2|2014-12-02|
JP2013519918A|2013-05-30|
CO6511277A2|2012-08-31|
US10566006B2|2020-02-18|
US20120310654A1|2012-12-06|
EA201270712A1|2013-01-30|
US20150043754A1|2015-02-12|
SG182632A1|2012-08-30|
CA2995461A1|2011-08-18|
CA3114177A1|2011-08-18|
CL2012002213A1|2012-11-30|
US20190325886A1|2019-10-24|
CA3075793C|2021-05-18|
KR20120124484A|2012-11-13|
AR080156A1|2012-03-14|
EP2534656A1|2012-12-19|
US10418045B2|2019-09-17|
US9646622B2|2017-05-09|
CA2995461C|2020-04-28|
MY169981A|2019-06-19|
JP6133263B2|2017-05-24|
BR122019025627B1|2021-01-19|
UA105277C2|2014-04-25|
JP2015045886A|2015-03-12|
EP2534656B1|2018-09-05|
TW201506912A|2015-02-16|
EP3444816A1|2019-02-20|
US20170213566A1|2017-07-27|
EA023730B1|2016-07-29|
CN102754151A|2012-10-24|
BR112012019880A2|2016-04-26|
US20200176008A1|2020-06-04|
CN102754151B|2014-03-05|
CA2787466A1|2011-08-18|
CA2787466C|2016-04-05|
CA2918302A1|2011-08-18|
WO2011100155A1|2011-08-18|
CN103795364B|2016-08-24|
TWI529703B|2016-04-11|
KR101381588B1|2014-04-17|
CA2918302C|2018-04-03|
TW201205559A|2012-02-01|
CA3075793A1|2011-08-18|
CN103795364A|2014-05-14|
MX2012008954A|2012-08-23|
EA023730B9|2016-11-30|
TWI447709B|2014-08-01|
JP5666625B2|2015-02-12|
Legal status:
2019-01-08| B06F| Objections, documents and/or translations needed after an examination request according [chapter 6.6 patent gazette]|
2019-09-10| B06U| Preliminary requirement: requests with searches performed by other patent offices: procedure suspended [chapter 6.21 patent gazette]|
2020-05-19| B09A| Decision: intention to grant [chapter 9.1 patent gazette]|
2020-10-13| B16A| Patent or certificate of addition of invention granted [chapter 16.1 patent gazette]|Free format text: TERM OF VALIDITY: 20 (TWENTY) YEARS COUNTED FROM 3 FEBRUARY 2011, SUBJECT TO THE LEGAL CONDITIONS. |
Priority:
Application number | Filing date | Patent title
US30364310P|2010-02-11|2010-02-11|
US61/303,643|2010-02-11|
PCT/US2011/023531|WO2011100155A1|2010-02-11|2011-02-03|System and method for non-destructively normalizing loudness of audio signals within portable devices|
BR122019025627-6A|BR122019025627B1|2010-02-11|2011-02-03|method and apparatus for decoding an encoded input signal to generate an audio output signal, non-transient means in a device for performing a method for decoding an encoded input signal to generate an audio output signal, method and apparatus|